feat!: Add per-execution runId, at-most-once tracking, and cross-process tracker resumption by jsonbailey · Pull Request #133 · launchdarkly/python-server-sdk-ai

jsonbailey · 2026-04-15T14:28:45Z

Summary

- Per-execution runId: Every tracker now includes a unique runId (UUID) in all track event payloads, enabling billing isolation per execution.
- At-most-once semantics: Each metric type (duration, tokens, success/error, feedback, time-to-first-token) can be tracked only once per tracker instance; subsequent calls are dropped and a warning is logged.
- create_tracker() factory on config objects: AICompletionConfig, AIAgentConfig, and AIJudgeConfig now carry an optional create_tracker callable that returns a fresh LDAIConfigTracker with a new runId each time it is called. It is set to None when the config is disabled.
- Per-invocation trackers in managed classes: ManagedModel.invoke(), ManagedAgent.run(), and Judge.evaluate() now call create_tracker() at the start of each invocation to get a fresh tracker. This fixes the multi-turn tracking issue where at-most-once guards blocked metrics from the second and later invocations.
- resumption_token property on tracker: URL-safe Base64-encoded (no padding) JSON string containing {runId, configKey, variationKey, version} for cross-process tracker reconstruction.
- LDAIClient.create_tracker(token, context): Reconstructs a tracker from a resumption token for deferred feedback scenarios. Validates required fields and raises ValueError for invalid tokens.
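
The token format described in the last two items can be sketched as follows. This is a minimal illustration, not the SDK's code: the function names, the compact JSON separators, and the exact set of required fields are assumptions made for the example.

```python
import base64
import json

# Assumption for this sketch: which fields count as "required".
REQUIRED_FIELDS = ("runId", "configKey", "version")


def encode_token(payload: dict) -> str:
    """Compact JSON, then URL-safe Base64 with the trailing '=' padding stripped."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_token(token: str) -> dict:
    """Restore the padding, decode, and validate; raise ValueError on any failure."""
    padded = token + "=" * (-len(token) % 4)
    try:
        payload = json.loads(base64.urlsafe_b64decode(padded.encode("ascii")))
    except Exception as exc:
        raise ValueError("invalid resumption token") from exc
    if not isinstance(payload, dict):
        raise ValueError("resumption token must decode to a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"resumption token is missing fields: {missing}")
    return payload


original = {"runId": "run-123", "configKey": "my-config", "variationKey": "v1", "version": 3}
token = encode_token(original)
```

Stripping the padding keeps the token safe to pass in URLs and headers; the decoder can always recompute it from the token length.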

Test plan

- Enabled config has create_tracker callable; disabled config has None
- Each create_tracker() call returns a new tracker with a distinct runId
- Factory closure captures correct flag metadata (configKey, variationKey, version, modelName, providerName)
- ManagedAgent.run() uses create_tracker() when available, falls back to stored tracker
- Resumption token round-trip encode/decode preserves all fields
- Resumption token has no base64 padding characters
- create_tracker(token, context) reconstructs tracker with original runId and empty model/provider
- Invalid base64, invalid JSON, and missing required fields all raise ValueError
- All 137 existing + new tests pass with no regressions

🤖 Generated with Claude Code

Note

Medium Risk
Touches core SDK tracking primitives and all runner integrations, so regressions could affect metrics emission and correlation (especially around tracker factory lifetimes and duplicate-suppression behavior).

Overview
This PR reworks tracking to be per execution by replacing stored tracker instances on configs/graphs with create_tracker() factories, and updating managed wrappers (ManagedModel, ManagedAgent, Judge, ManagedAgentGraph) and both LangChain/LangGraph and OpenAI agent-graph runners to create (and in OpenAI, cache) trackers per run/node.

It also adds a runId to every LDAIConfigTracker event, enforces at-most-once semantics for key metric types (dropping duplicates with warnings), and introduces tracker resumption via LDAIConfigTracker.resumption_token + LDAIClient.create_tracker(token, context) for deferred/cross-process feedback. Tests are updated broadly to use factories, validate runId consistency/uniqueness, ensure graph/node tracking uses the correct single tracker per run, and cover resumption-token encoding/decoding and error handling.

Reviewed by Cursor Bugbot for commit 35aef5e. Bugbot is set up for automated code reviews on this repo.

- Each tracker now carries a runId (UUIDv4) included in all emitted events, scoping every metric to a single execution
- At-most-once semantics: duplicate calls to track_duration, track_tokens, track_success/track_error, track_feedback, and track_time_to_first_token on the same tracker are dropped with a warning

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ess tracker resumption Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…osure The run_id parameter on LDAIConfigTracker is now required (no default). UUID generation happens in the tracker_factory closure in client.py, keeping the tracker itself a plain data holder. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
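
A rough sketch of the closure pattern this commit describes, with the tracker as a plain data holder and the UUID minted in the factory. `SketchTracker` and `make_tracker_factory` are hypothetical names for illustration and do not reflect the actual signatures in client.py.

```python
import uuid
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SketchTracker:
    """Plain data holder: it receives run_id and never generates one itself."""
    run_id: str
    config_key: str
    version: int


def make_tracker_factory(config_key: str, version: int) -> Callable[[], SketchTracker]:
    """Return a closure that captures flag metadata and mints a new run_id per call."""
    def tracker_factory() -> SketchTracker:
        return SketchTracker(run_id=str(uuid.uuid4()), config_key=config_key, version=version)
    return tracker_factory


create_tracker = make_tracker_factory("my-config", 3)
first, second = create_tracker(), create_tracker()
```

Each call yields a tracker with the same flag metadata but a different run id, which is what lets events be grouped per execution.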

- Break long tuple lines in client.py to stay under 120 char limit
- Add required run_id parameter to LDAIConfigTracker calls in openai and langchain provider tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove the redundant _tracked dict from LDAIConfigTracker. The summary already stores each metric with None as the unset sentinel, so the nil-check on summary properties serves as the at-most-once guard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
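
The idea of using the summary's `None` sentinel as the guard might look roughly like this. The class and field names are invented for the sketch, and the `emitted` list stands in for events that a real tracker would send.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional, Tuple

log = logging.getLogger("at_most_once_sketch")


@dataclass
class SketchSummary:
    # None means "not tracked yet" and doubles as the at-most-once guard.
    duration_ms: Optional[int] = None
    total_tokens: Optional[int] = None


class SketchGuardedTracker:
    def __init__(self) -> None:
        self.summary = SketchSummary()
        self.emitted: List[Tuple[str, int]] = []  # stand-in for emitted events

    def track_duration(self, duration_ms: int) -> None:
        if self.summary.duration_ms is not None:
            log.warning("duration already tracked; dropping duplicate")
            return
        self.summary.duration_ms = duration_ms
        self.emitted.append(("duration", duration_ms))

    def track_tokens(self, total_tokens: int) -> None:
        if self.summary.total_tokens is not None:
            log.warning("tokens already tracked; dropping duplicate")
            return
        self.summary.total_tokens = total_tokens
        self.emitted.append(("tokens", total_tokens))


guarded = SketchGuardedTracker()
guarded.track_duration(120)
guarded.track_duration(999)  # dropped: duration is already set
guarded.track_tokens(42)
```

Checking `is not None` rather than truthiness matters here, since a legitimate value of 0 must still count as "already tracked".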

New order: ld_client, run_id, config_key, variation_key, version, model_name, provider_name, context, graph_key. All call sites converted to keyword arguments for resilience against future reorders. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…oken Reorder LDAIConfigTracker.__init__ to match updated spec: context now comes before model_name and provider_name. Also fix resumption_token to omit variationKey from the JSON when it is empty, and handle the absent key when reconstructing from a token. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

All six at-most-once guard warnings in tracker.py now log the track data dict (runId, configKey, etc.) to aid debugging duplicate-track scenarios. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move the resumption token decoding logic from LDAIClient.create_tracker into a classmethod on LDAIConfigTracker per spec 1.1.20.2. The client method now delegates to LDAIConfigTracker.from_resumption_token. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Match the resumption token behavior: only include variationKey in the track data dict when it has a non-empty value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The create_tracker field on AIConfig is now always a callable that returns a working tracker, even when the config is disabled. The factory is always set to tracker_factory — callers use the enabled flag to decide whether to proceed, not the factory result. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

BREAKING CHANGE: The `tracker` field has been removed from all config dataclasses (AICompletionConfig, AIJudgeConfig, AIAgentConfig). Users must now call `config.create_tracker()` to obtain a tracker instance. ManagedModel and ManagedAgent no longer accept a tracker constructor parameter — they call `create_tracker()` from the config on each invocation. The `__evaluate` return tuple no longer includes a pre-created tracker. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
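
To illustrate why the managed classes call the factory on each invocation, here is a small sketch with stand-in classes. `SketchRunTracker` and `SketchManagedModel` are hypothetical, and the "metric" and "model response" are placeholders rather than anything the SDK does.

```python
import itertools
from typing import Callable, List, Optional


class SketchRunTracker:
    """Stand-in tracker with an at-most-once guard on duration."""

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self.duration_ms: Optional[int] = None

    def track_duration(self, duration_ms: int) -> bool:
        if self.duration_ms is not None:
            return False  # duplicate dropped
        self.duration_ms = duration_ms
        return True


class SketchManagedModel:
    """Calls the factory on every invoke() instead of holding one tracker."""

    def __init__(self, create_tracker: Callable[[], SketchRunTracker]) -> None:
        self._create_tracker = create_tracker
        self.history: List[SketchRunTracker] = []

    def invoke(self, prompt: str) -> str:
        tracker = self._create_tracker()      # fresh tracker, fresh run id
        tracker.track_duration(len(prompt))   # placeholder metric for the sketch
        self.history.append(tracker)
        return prompt.upper()                 # placeholder for a model response


_counter = itertools.count(1)
model = SketchManagedModel(lambda: SketchRunTracker(f"run-{next(_counter)}"))
model.invoke("hello")
model.invoke("second turn")
```

With a single stored tracker, the second `invoke()` would hit the guard and lose its duration; with a factory, each turn records its own.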

Add graphKey to the resumption token following the spec key order: runId, configKey, variationKey (if set), version, graphKey (if set). The from_resumption_token classmethod now decodes and passes graphKey to the tracker constructor. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
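
The key order and optional fields described here could be assembled along these lines. `build_token_payload` is a hypothetical helper, and the sketch relies on Python dicts preserving insertion order.

```python
from typing import Any, Dict, Optional


def build_token_payload(
    run_id: str,
    config_key: str,
    version: int,
    variation_key: Optional[str] = None,
    graph_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the payload in the stated key order, skipping unset optional keys."""
    payload: Dict[str, Any] = {"runId": run_id, "configKey": config_key}
    if variation_key:  # omitted when None or empty
        payload["variationKey"] = variation_key
    payload["version"] = version
    if graph_key:      # omitted when None or empty
        payload["graphKey"] = graph_key
    return payload


full = build_token_payload("r1", "cfg", 2, variation_key="v1", graph_key="g1")
minimal = build_token_payload("r1", "cfg", 2)
```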

Judge now calls self._ai_config.create_tracker() per evaluate() invocation instead of receiving a tracker at construction time. ManagedAgentGraph no longer stores or exposes a tracker. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace logging.getLogger(__name__) with the SDK's shared log instance (from ldai import log) for consistency with the rest of the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Migrate langchain and openai provider packages from config.tracker to config.create_tracker() and fix test signatures to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… factory Per AIGRAPH spec 1.4.3, AgentGraphDefinition now has a create_tracker callable that returns a new AIGraphTracker per invocation instead of storing a pre-created instance. Removes get_tracker() method entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

_flush_final_segment and _track_tool_calls were each calling create_tracker() independently, generating new runIds that broke per-execution event correlation. Now build_node creates one tracker per node, cached in _node_trackers, and reused by all tracking methods. Adds test_same_run_id_across_token_success_and_tool_call_events to verify all node-level events for a single execution share one runId. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
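
A simplified picture of the caching fix: create the tracker once per node and hand the same instance to every tracking method. The classes below are stand-ins written for this sketch; only the `_node_trackers` name comes from the commit message.

```python
import uuid
from typing import Callable, Dict, List


class SketchNodeTracker:
    """Stand-in tracker: every event it records carries the same run id."""

    def __init__(self) -> None:
        self.run_id = str(uuid.uuid4())
        self.events: List[dict] = []

    def track(self, name: str) -> None:
        self.events.append({"event": name, "runId": self.run_id})


class SketchNodeTrackerCache:
    """Creates at most one tracker per node key and reuses it afterwards."""

    def __init__(self, create_tracker: Callable[[], SketchNodeTracker]) -> None:
        self._create_tracker = create_tracker
        self._node_trackers: Dict[str, SketchNodeTracker] = {}
        self.factory_calls = 0

    def tracker_for(self, node_key: str) -> SketchNodeTracker:
        if node_key not in self._node_trackers:
            self.factory_calls += 1
            self._node_trackers[node_key] = self._create_tracker()
        return self._node_trackers[node_key]


cache = SketchNodeTrackerCache(SketchNodeTracker)
cache.tracker_for("node-a").track("tokens")
cache.tracker_for("node-a").track("success")
cache.tracker_for("node-a").track("tool_call")
run_ids = {event["runId"] for event in cache.tracker_for("node-a").events}
```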

run() and _build_agents() each called create_tracker() on the graph, producing two tracker instances. Now run() creates the tracker once and passes it to _build_agents() so handoff callbacks and run-level tracking share the same instance. Tests now assert graph.create_tracker is called exactly once per run and node create_tracker is called exactly once per node. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

from_resumption_token and LDAIClient.create_tracker now return ldclient.Result instead of raising ValueError on invalid tokens, letting callers handle errors without try/except. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
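
The shift from raising to returning a result object can be sketched as below. `SketchResult` is a hypothetical stand-in and is not the `ldclient.Result` API; `parse_token` and its single required field are likewise assumptions for the example.

```python
import base64
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical result type: holds either a value or an error message."""
    value: Optional[Any] = None
    error: Optional[str] = None

    def is_success(self) -> bool:
        return self.error is None


def parse_token(token: str) -> SketchResult:
    """Decode a token and report failure in the return value instead of raising."""
    try:
        padded = token + "=" * (-len(token) % 4)
        payload = json.loads(base64.urlsafe_b64decode(padded.encode("ascii")))
    except Exception as exc:
        return SketchResult(error=f"invalid resumption token: {exc}")
    if not isinstance(payload, dict) or "runId" not in payload:
        return SketchResult(error="resumption token is missing runId")
    return SketchResult(value=payload)


good_token = base64.urlsafe_b64encode(json.dumps({"runId": "r1"}).encode()).rstrip(b"=").decode()
good = parse_token(good_token)
bad = parse_token("%%%not-base64%%%")
```

Callers branch on `is_success()` rather than wrapping the call in try/except, which is the ergonomic change the commit describes.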

Change AgentGraphDefinition.create_tracker from Callable[[], AIGraphTracker] with default lambda: None to Optional[Callable[[], AIGraphTracker]] with default None. Guard call sites in both runners with `is not None` before invoking. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The disabled() factory on AIConfigDefault and subclasses created configs without tracker factories, breaking the spec requirement. Replace with private module-level constants in client.py, matching how js-core handles disabled configs as an internal concern. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Raise a clear RuntimeError if create_tracker returns None rather than letting it crash with AttributeError on track_metrics_of_async. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Cache node trackers in langgraph_callback_handler flush() to avoid creating multiple trackers per node with different runIds
- Read graph key directly from config instead of instantiating a tracker just for debug logging in langgraph_agent_graph_runner
- Simplify redundant except (json.JSONDecodeError, Exception) to except Exception in tracker.py from_resumption_token

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

AIConfig.create_tracker is now a required field with no default value. The SDK client always injects a real tracker factory, so any direct construction of AIConfig subclasses must now provide one explicitly. This eliminates the entire class of null-safety issues around tracker factories. Reverts the RuntimeError guard in Judge.evaluate() since it is no longer needed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Convenience factory for the common fallback case. Added to AIConfigDefault, AICompletionConfigDefault, AIAgentConfigDefault, and AIJudgeConfigDefault. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace private _DISABLED_*_DEFAULT constants and inline AIXxxConfigDefault(enabled=False) calls with the new disabled() classmethod. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…t overrides The base AIConfigDefault.disabled() already uses cls(), so subclass overrides were unnecessary. Use Self return type annotation for correct narrowing and remove the three identical overrides from AICompletionConfigDefault, AIAgentConfigDefault, AIJudgeConfigDefault. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
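
Why the subclass overrides become unnecessary can be seen in a small sketch. The PR annotates the return type with `Self`; the example below swaps in a bound `TypeVar`, the older standard-library equivalent, so it runs without extra dependencies. The class names are stand-ins.

```python
from dataclasses import dataclass
from typing import Type, TypeVar

# Bound TypeVar used here in place of Self, for standard-library-only execution.
T = TypeVar("T", bound="SketchConfigDefault")


@dataclass
class SketchConfigDefault:
    enabled: bool = True

    @classmethod
    def disabled(cls: Type[T]) -> T:
        # cls is whichever class the method was called on, so subclasses
        # get an instance of their own type without overriding anything.
        return cls(enabled=False)


@dataclass
class SketchAgentConfigDefault(SketchConfigDefault):
    instructions: str = ""


base_default = SketchConfigDefault.disabled()
agent_default = SketchAgentConfigDefault.disabled()
```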

mypy targets python_version = "3.10" but typing.Self was added in 3.11. Use unconditional typing_extensions import and remove the unnecessary from __future__ import annotations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Change `if self._graph_key is not None` to `if self._graph_key` so that empty string is treated as "not set", matching the truthy check already used in resumption_token. Prevents round-trip data loss where a tracker with graph_key="" emits graphKey in events but omits it from the token. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
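
The difference between the two checks shows up only for an empty string. A minimal sketch with hypothetical helper names:

```python
from typing import Dict, Optional


def track_data_before(run_id: str, graph_key: Optional[str]) -> Dict[str, str]:
    data = {"runId": run_id}
    if graph_key is not None:  # an empty string passes this check
        data["graphKey"] = graph_key
    return data


def track_data_after(run_id: str, graph_key: Optional[str]) -> Dict[str, str]:
    data = {"runId": run_id}
    if graph_key:  # None and "" are both treated as "not set"
        data["graphKey"] = graph_key
    return data


before = track_data_before("r1", "")
after = track_data_after("r1", "")
```

With the truthy check, event data and the resumption token agree on whether a graph key exists.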

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-04-21T14:32:29Z

+            config_tracker = node.get_config().create_tracker()
            if not config_tracker:
                continue
+            node_trackers[node_key] = config_tracker


Flush always tracks success even for failed nodes

Medium Severity

The flush() method unconditionally calls config_tracker.track_success() for every node in the execution path. Combined with the new at-most-once guard on track_success/track_error, this means that if a node actually failed, track_success() is still called first and locks out any subsequent track_error() call. With the old code (no at-most-once guards), external code could potentially override this, but now the guard permanently records success for every flushed node regardless of actual outcome.
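
The lockout described in this comment can be reproduced with a stand-in tracker. The sketch assumes success and error share one guarded field, as the comment implies; the names are hypothetical and this is not the handler's actual code.

```python
from typing import Optional


class SketchOutcomeTracker:
    """Stand-in tracker where success and error share one at-most-once slot."""

    def __init__(self) -> None:
        self.success: Optional[bool] = None  # None means no outcome recorded yet

    def track_success(self) -> bool:
        if self.success is not None:
            return False  # duplicate dropped
        self.success = True
        return True

    def track_error(self) -> bool:
        if self.success is not None:
            return False  # duplicate dropped
        self.success = False
        return True


def flush_unconditionally(tracker: SketchOutcomeTracker) -> None:
    # Mirrors the reported behaviour: success is recorded for every node.
    tracker.track_success()


node_tracker = SketchOutcomeTracker()
flush_unconditionally(node_tracker)
error_recorded = node_tracker.track_error()  # the node actually failed
```

Once the flush has run, a later error report for the same node is dropped, so the recorded outcome stays "success".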

jsonbailey · 2026-04-21

We need to rework some of this logic and will do it in a separate PR. This is currently experimental.

jsonbailey changed the title ~~feat!: Add per-execution runId and at-most-once event tracking~~ feat!: Add per-execution runId, at-most-once tracking, and cross-process tracker resumption Apr 15, 2026

jsonbailey and others added 2 commits April 16, 2026 11:03

feat!: Add per-execution runId, at-most-once tracking, and cross-proc…

211ead4

jsonbailey force-pushed the jb/aic-2207/update-ai-sdks-billing-spec branch from bdf7384 to 211ead4 Compare April 16, 2026 16:48

jsonbailey and others added 13 commits April 16, 2026 12:46

fix: Fix CI lint errors and add run_id to provider tests

d895e64

chore: Include track data in at-most-once warning logs

59c574e

fix: Omit variationKey from track data when empty

ba5421a

chore: Use ldai.log instead of logging module in tracker

08da63a

jsonbailey commented Apr 17, 2026

Comment thread packages/sdk/server-ai/src/ldai/tracker.py Outdated

fix: Update provider packages for tracker factory pattern

ae5d752

jsonbailey marked this pull request as ready for review April 17, 2026 16:56

jsonbailey requested a review from a team as a code owner April 17, 2026 16:56

cursor Bot reviewed Apr 17, 2026

Comment thread packages/ai-providers/server-ai-openai/src/ldai_openai/openai_agent_graph_runner.py Outdated

andrewklatzke approved these changes Apr 17, 2026

jsonbailey and others added 3 commits April 17, 2026 14:56

cursor Bot reviewed Apr 17, 2026

Comment thread packages/sdk/server-ai/src/ldai/tracker.py

Comment thread packages/ai-providers/server-ai-openai/src/ldai_openai/openai_agent_graph_runner.py Outdated

cursor Bot reviewed Apr 17, 2026

Comment thread packages/sdk/server-ai/src/ldai/judge/__init__.py

jsonbailey and others added 2 commits April 17, 2026 17:27

cursor Bot reviewed Apr 17, 2026

Comment thread packages/ai-providers/server-ai-langchain/src/ldai_langchain/langgraph_callback_handler.py

Comment thread packages/ai-providers/server-ai-langchain/src/ldai_langchain/langgraph_agent_graph_runner.py Outdated

Comment thread packages/sdk/server-ai/src/ldai/tracker.py

fix: Guard against None tracker in Judge.evaluate()

edd1690

cursor Bot reviewed Apr 20, 2026

Comment thread packages/sdk/server-ai/src/ldai/managed_agent.py

jsonbailey and others added 4 commits April 20, 2026 15:05

feat: Add disabled() classmethod to all AIConfigDefault variants

db25c8f

refactor: Use disabled() classmethod for fallback defaults in client

0aeb164

cursor Bot reviewed Apr 20, 2026

Comment thread packages/sdk/server-ai/src/ldai/agent_graph/__init__.py Outdated

chore: Remove unused DEFAULT_FALSE constant from agent_graph

d721142

cursor Bot reviewed Apr 20, 2026

Comment thread packages/sdk/server-ai/src/ldai/models.py Outdated

cursor Bot reviewed Apr 21, 2026

Comment thread packages/sdk/server-ai/src/ldai/tracker.py

jsonbailey and others added 2 commits April 21, 2026 09:22

cursor Bot reviewed Apr 21, 2026

jsonbailey merged commit 68685cd into main Apr 21, 2026
46 checks passed

jsonbailey deleted the jb/aic-2207/update-ai-sdks-billing-spec branch April 21, 2026 16:16

github-actions Bot mentioned this pull request Apr 20, 2026

chore: release main #136

Open

jsonbailey merged 31 commits into main from jb/aic-2207/update-ai-sdks-billing-spec

3 participants